Antonio Coín Castro

Linear - p free

New functions

Important

Example dataset

Common model hyperparameters

Sklearn model comparison

Maximum Likelihood Estimator

The Ensemble Sampler and the emcee library

Experiments

We set up the initial points of the chains to be in a random neighbourhood around the MLE to increase the speed of convergence.
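The initialization described above could be sketched as follows. This is only an illustration under assumptions: the helper `initial_walkers`, the perturbation `scale`, the number of walkers, and the MLE values are hypothetical and are not taken from the notebook.

```python
import numpy as np


def initial_walkers(mle, n_walkers, scale=1e-2, seed=0):
    """Scatter walkers in a small Gaussian ball around the MLE.

    `scale` is an assumed perturbation size, not a value from the notebook.
    """
    rng = np.random.default_rng(seed)
    mle = np.asarray(mle, dtype=float)
    # One row per walker, one column per parameter.
    noise = scale * rng.standard_normal((n_walkers, mle.size))
    return mle + noise


theta_mle = np.array([1.5, -0.3, 0.8])  # hypothetical MLE
p0 = initial_walkers(theta_mle, n_walkers=32)
```

An array of shape `(n_walkers, ndim)` such as `p0` is the form of initial state that emcee's `EnsembleSampler.run_mcmc` accepts.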

Analysis

Out-of-sample predictions

We can perform a couple of visual posterior predictive checks. In particular:

We also show the Bayesian p-value for several statistics $T$, defined as $P(T(y^*)\leq T(y)\mid y)$. It is estimated as the proportion of generated samples $\{T(Y^*_m)\}_m$ that fall at or below the observed value $T(y)$. It is expected to be around $0.5$ when the model accurately represents the data, whereas values close to $0$ or $1$ indicate that the model fails to reproduce that statistic.
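As a minimal sketch of this computation, assuming the posterior predictive draws are stored as an array with one replicated dataset per row (the function name `bayesian_p_value` and the synthetic data are illustrative, not the notebook's own code):

```python
import numpy as np


def bayesian_p_value(y_rep, y_obs, stat=np.mean):
    """Proportion of replicated statistics T(y*_m) at or below T(y)."""
    t_obs = stat(y_obs)
    t_rep = np.array([stat(y_m) for y_m in y_rep])
    return float(np.mean(t_rep <= t_obs))


rng = np.random.default_rng(0)
y_obs = rng.normal(0.0, 1.0, size=100)
# Stand-in for posterior predictive draws: 2000 replicated datasets.
y_rep = rng.normal(0.0, 1.0, size=(2000, 100))

p_values = {
    "mean": bayesian_p_value(y_rep, y_obs, np.mean),
    "std": bayesian_p_value(y_rep, y_obs, np.std),
    "min": bayesian_p_value(y_rep, y_obs, np.min),
}
```

Any statistic that maps a dataset to a scalar can be passed as `stat`.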

Save & Load

This is only for testing purposes; in a production environment one should use the Backends feature of emcee.

The PyMC library

Model

Experiments

Analysis

Since the tuning iterations already serve as burn-in, we keep the whole trace. We could additionally thin the samples to reduce their autocorrelation.
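To illustrate what thinning does, here is a small sketch on a synthetic autocorrelated chain. The AR(1) chain, the thinning factor, and the helper `lag1_autocorr` are assumptions made for the example; they do not come from the notebook's trace.

```python
import numpy as np


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a 1-D chain."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


rng = np.random.default_rng(0)

# Synthetic AR(1) chain standing in for one parameter's MCMC draws.
n_draws, phi = 5000, 0.9
chain = np.empty(n_draws)
chain[0] = rng.normal()
for t in range(1, n_draws):
    chain[t] = phi * chain[t - 1] + rng.normal()

thin = 10  # hypothetical thinning factor
thinned = chain[::thin]  # keep every 10th draw

rho_full = lag1_autocorr(chain)
rho_thinned = lag1_autocorr(thinned)
```

Thinning lowers the correlation between consecutive retained draws at the cost of discarding samples, so it is mostly useful for saving memory or storage.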

Out-of-sample predictions

First we take a look at the distribution of predictions on a previously unseen dataset.

Next we look at the MSE when using several point-estimates for the parameters, as well as the mean of the posterior samples.
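The comparison can be sketched as below for a simple linear model $y = \alpha + \beta x$. Everything here is a stand-in: the synthetic test set, the simulated posterior samples `alpha` and `beta`, and the choice of mean and median as point estimates are assumptions, not the notebook's model or results.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between observations and predictions."""
    return float(np.mean((y_true - y_pred) ** 2))


rng = np.random.default_rng(0)

# Synthetic unseen data from y = 2 + 3x + noise.
X_test = rng.uniform(-1.0, 1.0, size=50)
y_test = 2.0 + 3.0 * X_test + rng.normal(0.0, 0.1, size=50)

# Stand-in posterior samples for intercept and slope.
alpha = rng.normal(2.0, 0.05, size=1000)
beta = rng.normal(3.0, 0.05, size=1000)

# Plug point estimates of the parameters into the model.
mse_by_estimate = {
    "mean": mse(y_test, alpha.mean() + beta.mean() * X_test),
    "median": mse(y_test, np.median(alpha) + np.median(beta) * X_test),
}

# Alternatively, predict with every posterior sample and average the predictions.
y_pred_samples = alpha[:, None] + beta[:, None] * X_test[None, :]
mse_posterior_mean = mse(y_test, y_pred_samples.mean(axis=0))
```

For a model that is linear in its parameters, averaging the predictions gives the same result as predicting with the posterior mean; the two approaches differ for non-linear models.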

Save & Load

Notebook metadata